149 research outputs found
A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues
Discourse structures are beneficial for various NLP tasks such as dialogue
understanding, question answering, sentiment analysis, and so on. This paper
presents a deep sequential model for parsing discourse dependency structures of
multi-party dialogues. The proposed model aims to construct a discourse
dependency tree by predicting dependency relations and constructing the
discourse structure jointly and alternately. It makes a sequential scan of the
Elementary Discourse Units (EDUs) in a dialogue. For each EDU, the model
decides to which previous EDU the current one should link and what the
corresponding relation type is. The predicted link and relation type are then
used to build the discourse structure incrementally with a structured encoder.
During link prediction and relation classification, the model utilizes not only
local information that represents the concerned EDUs, but also global
information that encodes the EDU sequence and the discourse structure that is
already built at the current step. Experiments show that the proposed model
outperforms all the state-of-the-art baselines.
Comment: Accepted to AAAI 201
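The sequential scan described above can be sketched as follows. This is a minimal illustration, not the paper's model: the two scoring functions are toy stand-ins for the neural link predictor and relation classifier, and the relation inventory is hypothetical.

```python
# Sketch of the sequential scan over EDUs: for each EDU, pick the best
# previous EDU to link to and a relation type, then fold the predicted
# edge back into the incrementally built structure. The two scorers are
# illustrative stubs, NOT the paper's neural scorers.

def link_score(structure, edus, i, j):
    """Stub: prefer linking to the immediately preceding EDU."""
    return -abs(i - j)

def relation_score(structure, edus, i, j, relation):
    """Stub: favor one default relation type."""
    return 1.0 if relation == "Comment" else 0.0

def parse_dialogue(edus, relations=("Comment", "QAP", "Continuation")):
    """Greedy sequential parse: returns a list of (head, dep, relation)."""
    structure = []                      # dependency edges built so far
    for i in range(1, len(edus)):       # EDU 0 is the root; scan left to right
        # 1) link prediction: choose which previous EDU to attach to
        head = max(range(i), key=lambda j: link_score(structure, edus, i, j))
        # 2) relation classification for the chosen link
        rel = max(relations, key=lambda r: relation_score(structure, edus, i, head, r))
        # 3) update the structure used as global context at later steps
        structure.append((head, i, rel))
    return structure

edges = parse_dialogue(["Hi!", "How are you?", "Fine, thanks."])
```

In the actual model, both scorers would condition on the local EDU representations plus the global sequence and structure encodings; here the structure argument is threaded through to show where that context enters.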
Robustness to Modification with Shared Words in Paraphrase Identification
Revealing the robustness issues of natural language processing models and
improving their robustness are important for reliable performance in
challenging situations. In this paper, we study the robustness of paraphrase identification
models from a new perspective -- via modification with shared words, and we
show that the models have significant robustness issues when facing such
modifications. To modify an example consisting of a sentence pair, we either
replace some words shared by both sentences or introduce new shared words. We
aim to construct a valid new example such that a target model makes a wrong
prediction. To find a modification solution, we use beam search constrained by
heuristic rules, and we leverage a BERT masked language model for generating
substitution words compatible with the context. Experiments show that the
performance of the target models has a dramatic drop on the modified examples,
thereby revealing the robustness issue. We also show that adversarial training
can mitigate this issue.
Comment: Findings of EMNLP 202
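The search procedure can be sketched as below. Everything here is a toy stand-in: the candidate generator replaces the BERT masked language model the paper uses, and the target model is a trivial stub, so this shows only the shape of the beam search, not the actual attack.

```python
# Sketch of the beam-search attack: modify shared words in a sentence
# pair until a (stub) paraphrase model flips its prediction. The
# candidate generator stands in for the BERT MLM; both it and the
# target model are toy stand-ins.

def candidates(word):
    """Stub for the BERT MLM: propose context-compatible substitutions."""
    table = {"cat": ["dog", "pet"], "mat": ["rug", "bed"]}
    return table.get(word, [])

def target_model(s1, s2):
    """Stub paraphrase identifier: 'paraphrase' iff the sentences match."""
    return s1 == s2

def attack(s1, s2, beam_width=3):
    """Replace shared words in s1 until the stub model's label flips."""
    shared = [i for i, w in enumerate(s1) if w in s2]
    original = target_model(s1, s2)
    beam = [s1]
    for i in shared:
        nxt = []
        for sent in beam:
            for sub in candidates(sent[i]):
                mod = sent[:i] + [sub] + sent[i + 1:]
                if target_model(mod, s2) != original:
                    return mod              # wrong prediction achieved
                nxt.append(mod)
        beam = (nxt or beam)[:beam_width]   # a real attack ranks by model score
    return None                             # no successful modification found

adv = attack(["the", "cat", "sat"], ["the", "cat", "sat"])
```

The paper's heuristic rules (validity constraints on which words may be replaced or introduced) would filter the candidate list before it enters the beam.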
Robustly Leveraging Prior Knowledge in Text Classification
Prior knowledge has been shown to be very useful for many natural language
processing tasks. Many approaches have been proposed to formalise various kinds
of knowledge; however, whether a proposed approach is robust or sensitive to the
knowledge supplied to the model has rarely been discussed. In this paper, we
propose three regularization terms on top of generalized expectation criteria,
and conduct extensive experiments to justify the robustness of the proposed
methods. Experimental results demonstrate that our proposed methods obtain
remarkable improvements and are much more robust than baselines.
From One Point to A Manifold: Knowledge Graph Embedding For Precise Link Prediction
Knowledge graph embedding aims at offering a numerical knowledge
representation paradigm by transforming the entities and relations into
continuous vector space. However, existing methods cannot characterize the
knowledge graph at a fine-grained level and therefore cannot make precise
predictions, for two reasons: they form an ill-posed algebraic system and impose
an overly strict geometric form. As precise prediction is critical, we propose a
manifold-based embedding principle (\textbf{ManifoldE}) which can be treated as a well-posed
algebraic system that expands the position of golden triples from one point in
current models to a manifold in ours. Extensive experiments show that the
proposed models achieve substantial improvements against the state-of-the-art
baselines especially for the precise prediction task, and yet maintain high
efficiency.
Comment: arXiv admin note: text overlap with arXiv:1509.0548
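The manifold principle can be illustrated with one common instantiation, a sphere; note this particular geometric form and the radius parameter are assumptions for illustration (the abstract only states the manifold principle), and the vectors below are toy values, not trained embeddings.

```python
# Sketch of a sphere-manifold scoring principle: instead of forcing
# h + r = t (a single point), a triple is plausible if the tail lies
# on a sphere of radius D_r around h + r. The radius and all vectors
# are toy values, not learned parameters.

def manifold_score(h, r, t, radius):
    """| ||h + r - t||^2 - D_r^2 | : zero when t lies on the manifold."""
    sq_dist = sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))
    return abs(sq_dist - radius ** 2)

# A tail exactly on the sphere (distance == radius) scores 0.
on_sphere = manifold_score([0.0, 0.0], [1.0, 0.0], [1.0, 2.0], radius=2.0)
off_sphere = manifold_score([0.0, 0.0], [1.0, 0.0], [1.0, 0.0], radius=2.0)
```

Expanding the golden position from one point to a whole manifold is what makes the algebraic system well-posed: many tails can satisfy the constraint exactly rather than all competing for a single point.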
An Interpretable Reasoning Network for Multi-Relation Question Answering
Multi-relation Question Answering is a challenging task because it requires
elaborate analysis of questions and reasoning over multiple fact triples in a
knowledge base. In this paper, we present a novel model called
Interpretable Reasoning Network that employs an interpretable, hop-by-hop
reasoning process for question answering. The model dynamically decides which
part of an input question should be analyzed at each hop; predicts a relation
that corresponds to the current parsed results; utilizes the predicted relation
to update the question representation and the state of the reasoning process;
and then drives the next-hop reasoning. Experiments show that our model yields
state-of-the-art results on two datasets. More interestingly, the model can
offer traceable and observable intermediate predictions for reasoning analysis
and failure diagnosis, thereby allowing manual manipulation in predicting the
final answer.
Comment: COLING 2018, 13page
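The hop-by-hop process can be sketched over a toy knowledge base. This is a bare skeleton under strong assumptions: the relation sequence is given up front rather than predicted from the question at each hop, and the KB and entity names are invented for illustration.

```python
# Sketch of hop-by-hop reasoning over a toy knowledge base: follow one
# relation per hop and record each intermediate prediction, mirroring
# the traceable, observable reasoning the model exposes. The KB and the
# pre-supplied relation sequence are illustrative stand-ins.

KB = {("Alice", "mother"): "Beth", ("Beth", "employer"): "Acme"}

def answer(question_relations, entity, kb=KB):
    """Follow one relation per hop; record each hop for traceability."""
    trace = []
    for rel in question_relations:       # one hop per predicted relation
        entity = kb[(entity, rel)]       # traverse the matching fact triple
        trace.append((rel, entity))      # observable intermediate prediction
    return entity, trace

final, hops = answer(["mother", "employer"], "Alice")
```

In the actual model, each hop would instead analyze a part of the question, predict the relation, and update the question representation; the trace list is what allows the failure diagnosis and manual manipulation the abstract describes.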
SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions
Knowledge representation is an important, long-standing topic in AI, and a
large amount of work has been devoted to knowledge graph embedding, which
projects symbolic entities and relations into a low-dimensional, real-valued
vector space. However, most embedding methods merely concentrate on data
fitting and ignore explicit semantic expression, leading to uninterpretable
representations. Thus, traditional embedding methods have limited potential for
many applications such as question answering and entity classification. To this
end, this paper proposes a semantic representation method for knowledge graph
\textbf{(KSR)}, which imposes a two-level hierarchical generative process that
globally extracts many aspects and then locally assigns a specific category in
each aspect for every triple. Since both aspects and categories are
semantics-relevant, the collection of categories in each aspect is treated as
the semantic representation of this triple. Extensive experiments demonstrate
that our model outperforms other state-of-the-art baselines substantially.
Comment: Submitted to AAAI.201
Story Ending Generation with Incremental Encoding and Commonsense Knowledge
Generating a reasonable ending for a given story context, i.e., story ending
generation, is a strong indication of story comprehension. This task requires
not only to understand the context clues which play an important role in
planning the plot but also to handle implicit knowledge to make a reasonable,
coherent story.
In this paper, we devise a novel model for story ending generation. The model
adopts an incremental encoding scheme to represent context clues that span the
story context. In addition, commonsense knowledge is applied through
multi-source attention to facilitate story comprehension, and thus to help
generate coherent and reasonable endings. Through building context clues and
using implicit knowledge, the model is able to produce reasonable story
endings.
Automatic and manual evaluation shows that our model can generate more
reasonable story endings than state-of-the-art baselines.
Comment: Accepted in AAAI201
Modeling Rich Contexts for Sentiment Classification with LSTM
Sentiment analysis on social media data such as tweets and weibo has become a
very important and challenging task. Because such data are intrinsically short,
noisy, and topically divergent, sentiment classification on them requires
modeling various contexts, such as the retweet/reply history of a tweet and the
social context of authors and their relationships. While few prior studies have
approached the issue of modeling contexts in tweets, this paper proposes a
hierarchical LSTM to model rich contexts in tweets, particularly long-range
contexts. Experimental results show that such contexts can help us perform
sentiment classification remarkably better.
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
Despite the success of existing referenced metrics (e.g., BLEU and
MoverScore), they correlate poorly with human judgments for open-ended text
generation including story or dialog generation because of the notorious
one-to-many issue: there are many plausible outputs for the same input, which
may differ substantially in wording or semantics from the limited number of
given references. To alleviate this issue, we propose UNION, a learnable
unreferenced metric for evaluating open-ended story generation, which measures
the quality of a generated story without any reference. Built on top of BERT,
UNION is trained to distinguish human-written stories from negative samples and
recover the perturbation in negative stories. We propose an approach of
constructing negative samples by mimicking the errors commonly observed in
existing NLG models, including repeated plots, conflicting logic, and
long-range incoherence. Experiments on two story datasets demonstrate that
UNION is a reliable measure for evaluating the quality of generated stories,
which correlates better with human judgments and is more generalizable than
existing state-of-the-art metrics.
Comment: Long paper; Accepted by EMNLP202
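The negative-sample construction can be sketched with two of the error types named above. These are deliberately simplified stand-ins for the paper's perturbations: duplicating a sentence mimics repeated plots, and permuting sentence order mimics long-range incoherence.

```python
# Sketch of negative-sample construction by mimicking common NLG errors:
# repeated plots (duplicate a sentence) and long-range incoherence
# (shuffle sentence order). Simplified stand-ins for the perturbations
# UNION is trained to detect and recover.

import random

def repeat_plot(sentences):
    """Mimic repetition: duplicate the first sentence at the end."""
    return sentences + [sentences[0]]

def shuffle_order(sentences, seed=0):
    """Mimic long-range incoherence: permute the sentence order."""
    rng = random.Random(seed)
    out = sentences[:]
    rng.shuffle(out)
    return out

story = ["She found a key.", "It opened a door.", "Inside was gold."]
negatives = [repeat_plot(story), shuffle_order(story)]
```

The metric itself would then be trained to score the original story above both perturbed versions, and additionally to reconstruct the unperturbed text.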
TransA: An Adaptive Approach for Knowledge Graph Embedding
Knowledge representation is a major topic in AI, and many studies attempt to
represent the entities and relations of a knowledge base in a continuous vector
space. Among these attempts, translation-based methods build entity and
relation vectors by minimizing the translation loss from a head entity to a
tail entity. Despite their success, translation-based methods suffer from an
oversimplified loss metric and are not expressive enough
to model the various and complex entities/relations in knowledge bases. To address
this issue, we propose \textbf{TransA}, an adaptive metric approach for
embedding, utilizing the metric learning ideas to provide a more flexible
embedding method. Experiments are conducted on the benchmark datasets and our
proposed method makes significant and consistent improvements over the
state-of-the-art baselines.
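The adaptive metric idea can be illustrated as follows. Treat this as a sketch under assumptions: the per-relation weight matrix and vectors are toy values, and the element-wise absolute value is one plausible form of the relation-specific Mahalanobis-style distance, not a verbatim reproduction of the paper's loss.

```python
# Sketch of an adaptive (Mahalanobis-style) translation loss: instead of
# the plain Euclidean norm ||h + r - t||, each relation gets its own
# weight matrix W_r that stretches or shrinks individual dimensions.
# The vectors and W_r below are toy values, not learned parameters.

def transa_score(h, r, t, W):
    """|h + r - t|^T W |h + r - t| with an element-wise absolute value."""
    d = [abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t)]
    return sum(d[i] * W[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

# With W = identity this reduces to the squared Euclidean translation loss.
identity = [[1.0, 0.0], [0.0, 1.0]]
score = transa_score([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], identity)
```

Learning W per relation is what makes the metric adaptive: dimensions that matter for one relation can be weighted heavily while irrelevant ones are suppressed, which a single shared Euclidean norm cannot do.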